Principles applied to reconstruction of the evolution of replication and the genetic code indicate
that the transition from binary, RNA ladder-replicators, reliant on purine self-recognition, to 
the double helix, with complementary purine-pyrimidine base pairs, was driven by changes 
at the codon-anticodon interface. These changes enabled expansion of the code, increasing 
the array of encoded amino acid residues incorporated into proteins. 
Anti-parallel pentose-phosphate strands in the scaffold of an RNA, or DNA, double helix were
recently interpreted to represent the conserved vestige of an earlier ladder-replicator (Davis
2017, 2018a,b). Bi-directional H-bonds forming double-helix base pairs, exhibiting A → U/T
→ A, G → C → G complementarity¹ (Watson and Crick, 1953a,b), were noted, in addition, to
conform with sequence propagation by self-recognition, as in A → A, G → G. A simple, direct
mechanism of sequence propagation, consistent with ladder-replicator antiquity, resulted.
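
The two pairing rules contrasted above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's model: the dictionary names and the `partner_bases` helper are hypothetical, and strand direction is ignored, so partner bases are simply listed position by position against the template.

```python
# Pairing rules named in the text; the names used here are illustrative only.
SELF_RECOGNITION = {"A": "A", "G": "G"}                   # purine ladder: A -> A, G -> G
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}    # RNA double helix

def partner_bases(template: str, rule: dict) -> str:
    """Return the base paired with each template position (strand direction ignored)."""
    return "".join(rule[base] for base in template)

template = "AGGA"
ladder_partner = partner_bases(template, SELF_RECOGNITION)
helix_partner = partner_bases(template, WATSON_CRICK)
helix_second_round = partner_bases(helix_partner, WATSON_CRICK)

print(ladder_partner)      # partner strand carries the template bases themselves
print(helix_partner)       # partner strand carries the complementary bases
print(helix_second_round)  # a second round of pairing restores the template
```

Under self-recognition a single round of pairing reproduces the template bases, whereas complementary pairing needs two rounds. That is one way to read the "simple, direct" character attributed to the ladder mechanism.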

The transition envisioned from propagation of a binary, linear base sequence to that of a
quaternary, coiled sequence, with elevated complexity density (Fig. S1), conforms in a
general sense with the anticipated direction of biological evolution.
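
One possible reading of the binary-to-quaternary step is in terms of information per base. The sketch below assumes, purely for illustration, that "complexity density" tracks the maximum information per sequence position and that all bases are equally likely; the text does not define the term this way, and `bits_per_position` is a hypothetical helper.

```python
import math

def bits_per_position(alphabet: str) -> float:
    """Maximum information per base for an alphabet of equally likely symbols."""
    return math.log2(len(set(alphabet)))

binary = bits_per_position("AG")        # purine-only ladder sequence
quaternary = bits_per_position("AGCU")  # purine/pyrimidine sequence

print(binary, quaternary)
```

On this assumption, a quaternary sequence carries twice the information per position of a binary one.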

Adenosine-rich codons assigned to diacid amino acids and their amides were found to
have formed the first code, which was identified after equating the time-order of base triplet
assignments with amino acid synthesis path-distance (Figs. S2, S3). This finding furnished
evidence that translation originated from the pre-code assembly of random sequence
oligopeptides of these amino acids (Asp¹, Asn², Glu¹, Gln²) on a poly(A) template (Fig. 1).
Based on the path invariant principle derived during the former investigation (Davis, 2012,
2015), it became apparent, as noted, that ladder replicators, reliant on purine
self-recognition, preceded Watson-Crick purine-pyrimidine complementarity and the double helix.
These findings linked the transition from propagation of a binary RNA sequence, based on
purine self-recognition in a ladder replicator, to quaternary purine/pyrimidine sequence
replicators with changes at the codon-anticodon interface, which enabled expansion of the
genetic code and its array of encoded amino acids.
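
The idea of equating assignment time-order with path-distance can be illustrated with a short sketch. It uses only path-distances quoted in this document (the superscripts on Asp, Asn, Glu, Gln, and the values for Val, Ile and Arg as read from Table S1); the function name and the grouping into stages are assumptions made for illustration, not the author's procedure.

```python
# Path-distances as quoted in this document; Val, Ile and Arg are read from Table S1.
PATH_DISTANCE = {"Asp": 1, "Glu": 1, "Asn": 2, "Gln": 2, "Val": 5, "Ile": 7, "Arg": 9}

def assignment_order(path_distance: dict) -> list:
    """Group amino acids by path-distance, shortest (assumed earliest) first."""
    stages = {}
    for amino_acid, steps in path_distance.items():
        stages.setdefault(steps, []).append(amino_acid)
    return [(steps, sorted(stages[steps])) for steps in sorted(stages)]

order = assignment_order(PATH_DISTANCE)
for steps, amino_acids in order:
    print(steps, amino_acids)
```

Sorting in this way places the diacids first and their amides second, which matches the order the text gives for the first code.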


¹ A, adenosine; G, guanosine; U/T, uridine/thymidine; C, cytidine; purines (A, G) predating pyrimidines (U(T),
C).
Fig. 1. Pre-code and transitional stages of translation, inferred from the NH₄⁺ Fixers
Code. Adaptor core structure is designated by Latin letters within brackets. In
pyrimidine-deficient stages, the proto-adaptor core is represented by a Greek symbol.
Asn² adaptor core, α, differs from that of the core structure group, A, of the
pre-code ladder adaptor, with which it competed for AAA codons.

Clover-leaf structures of pre-code, diacid, and asparagine (Asn) adaptors in Fig. 2 were
constructed from consensus sequences for pre-divergence tRNA (Table S2). Retention of
core structure groups throughout code formation indicates that the adaptors have not changed
fundamentally over the intervening interval.








Fig. 2. Clover-leaf structures of pre-code, diacid, and asparagine ladder
adaptors. These structures were deduced from consensus tRNA sequences
given in Table S2.
Fig. 3. (above) Transition from (a) the ladder form of codon:anticodon pair AAA:AAA, through
transitional pairs (b) and (c), to the type A RNA double helix in (d), with complete displacement of the initial
AAA anticodon by the complementary pyrimidine, U.


The ladder-to-double-helix transition depicted in Fig. 3, with stepwise replacement of A by its
complementary pyrimidine from adaptor position N34 to N36, accompanied increasing
coding specificity. Coevolution of the proof-reading purines in the decoding center could be
anticipated.
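
The stepwise replacement described for Fig. 3 can be sketched as follows. This is an illustration under two assumptions not stated in the text: the anticodon string is written with N34 first, and stages (a) to (d) correspond to zero through three A-to-U replacements. The function name is hypothetical.

```python
def ladder_to_helix_stages(anticodon: str = "AAA") -> list:
    """List anticodons as A is replaced by U one position at a time, N34 first."""
    stages = [anticodon]
    bases = list(anticodon)
    for i in range(len(bases)):   # indices 0, 1, 2 stand for N34, N35, N36
        if bases[i] == "A":
            bases[i] = "U"
            stages.append("".join(bases))
    return stages

stages = ladder_to_helix_stages()
print(stages)
```

Each successive stage pairs one more codon A with a complementary U, ending with an all-pyrimidine anticodon against the AAA codon.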




Supplement 



Fig. S1. Comparison between RNA tetramers in a purine-ladder (a) and type A double-helix
(b). Purine self-recognition pairs characterize the former, and Watson-Crick purine-pyrimidine
complementarity governs pair formation in the latter. Arrows between designated bases
indicate that base-pair H-bonds are bi-directional. Upper and lower arrows refer to the direction of
the anti-parallel poly(ribose-phosphate) scaffold. Bar length, 1 Å.
Fig. S2. Amino acid path-distances in reconstructed tRNA-dependent synthesis pathways.
The number of reaction steps appears in the overbar. Oxaloacetate (oa) was precursor to fifteen
amino acids, forming an enlarged, pre-enzyme aspartate super-family. Ketoglutarate (kg)
produced four glutamate family amino acids and ribose-5-phosphate (rp) produced one; cc
refers to citrate cycle metabolite, and ct to central trunk. An intermediate lacking a tRNA
cofactor attachment site is represented by a white background. Three-letter amino acid
abbreviations occur in the left-hand column; upper-case, single-letter amino acid
abbreviations occur within pathways. Lower-case, double-letter abbreviations denote
non-amino-acid intermediates (Michal, 1992): py, pyruvate; pe, phosphoenolpyruvate; gp,
2-phosphoglycerate; pg, 3-phosphoglycerate. Thr - ap, aspartyl-phosphate; as,
aspartate-β-semialdehyde; hs, homoserine; ph, O-phospho-homoserine. Ile - kb, α-keto-butyrate; ab,
α-aceto-α-hydroxy-butyrate; dv, α,β-dihydroxy-isovalerate; kv, α-keto-isovalerate. Met - sh,
O-succinyl-homoserine; hc, homocysteine. Arg - ng, N-acetyl-glutamate; np,
N-acetyl-glutamate-phosphate; ns, N-acetyl-glutamate-γ-semialdehyde; no, N-acetyl-ornithine; or,
ornithine; on, citrulline; rs, arginine-succinate. Lys - dl, α,β-dihydropicolinate; pd,
Δ¹-piperideine-2,6-dicarboxylate; sk, N-succinyl-ε-keto-α-amino-pimelate; sa,
N-succinyl-α,ε-diamino-pimelate; dp, α,ε-diamino-L-pimelate; sp, meso-α,ε-diamino-pimelate. Ala - nd, Glu
amine-donor; Asn-like cofactor/adaptor. Val - al, α-aceto-lactate; dl,
α,β-dihydroxy-isovalerate; kl, α-keto-isovalerate. Leu - pm, α-isopropyl-malate; im, β-isopropyl-malate; ic,
α-keto-isocaproate. Ser - op, phospho-hydroxypyruvate; ps, phospho-serine. Trp - ah,
β-deoxy-arabino-heptulosonate-7-phosphate; dq, 5-dehydroquinate; ds, 5-dehydro-shikimate;
sk, shikimate; kp, shikimate-5-phosphate; ps, 3-enolpyruvyl-shikimate-5-phosphate; ca,
chorismate; aa, anthranilate; ra, N-phospho-ribosyl-anthranilate; or,
1-(o-carboxyphenylamino)-1'-deoxyribulose-5-phosphate; ip, indole-3-glycerol-phosphate; in,
indole. Phe - pf, prephenate; fp, phenyl-pyruvate. Tyr - hf, p-hydroxy-phenyl-pyruvate. Pro -
gs, glutamate-γ-semialdehyde; pc, Δ¹-pyrroline-5'-carboxylate. His - pp,
phosphatidyl-ribosyl-pyrophosphate; pt, phospho-ribosyl-adenosine-triphosphate; rm,
phospho-ribosyl-adenosine-monophosphate; ro,
phospho-ribosyl-formimino-amino-imidazole-carboxamide-ribose-phosphate; ru,
phospho-ribulosyl-formimino-amino-imidazole-carboxamide-ribose-phosphate; ig,
erythro-imidazole-glycerol-phosphate; ia, imidazole-acetol-phosphate; hp,
histidinol-phosphate; ho, histidinol.
Fig. S3. Relationship between the extension of tRNA-dependent amino acid synthesis
pathways from central metabolism and the inferred time-order of codon assignments during
formation of the genetic code. Adapted from Davis (2018).




















































Table S1. Known and new structural features encompassed by the path-distance model of
genetic code evolution. A comprehensive list of code features, with interpretations, appears in
the references under this table.

GENETIC CODE FEATURES UNIFIED BY PATH-DISTANCE MODEL

Known

1. Woese, 1965: NAN, NUN aa hydropathy clusters.
2. Bretscher et al. 1965: nonsense codon inhibition.
3. Nirenberg et al. 1966: aa biosynthetic clusters.
4. Dunnill, 1966: codon 4-sets have 5'/mid-G, C.
5. Crick, 1966: universality of standard code.
6. Wilcox, Nirenberg, 1968: tRNA an aa cofactor.
7. Rodwell, 1969: Ile⁷ path has four Val⁵ steps.
8. Dillon, 1973: 4-sets predated 2- and 1-sets.
9. Dillon, 1973, Wong, 1975: aa coevolved with tRNA.
10. Dillon, 1973, Wächtershäuser, 1992: aa synthesis paths formed by reductive organo-synthesis.
11. Perlwitz et al. 1988: mid-base most coding capacity.
12. Taylor, Coates, 1989: sibling aa share codon 5'-base.
13. Taylor, Coates, 1989: smallest aa assigned 4-sets.
14. Garrett, Grisham, 1999: na-like aa have long paths.
15. Lim, Curran, 2001: Y:Y wobble split eight c4-sets.
16. Brooks et al. 2002: ancient proteins have early aa.
17. Brooks et al. 2002: early aa in ancient proteins.
18. Brooks, Fresco, 2003: GNN code for early aa.
19. Biro et al. 2003: codon R, Y mid-base aa clusters.
20. Norgaard et al. 2009: reconstruction of Pro-Fd-5.
21. Rodin et al. 2009: tRNA N2:N71 complementarity.
22. Williams et al. 2009: Synthetase duality.

New

1. Code comprises six domains of contiguous codons read by related pre-LUCA tRNA for same-family aa.
2. Amino acid synthesis intermediates retain an invariant α-carboxyl linked to early tRNA-cofactor attachment.
3. Path-distances reveal codon bases were assigned to distinct kinds of aa in 5' → mid → 3' order.
4. Compact XAN codon set (X, coding site) first encoded four N-fixer aa (1-2 step paths) and a stop signal.
5. First code places origin of proteins at two RCC N-fixer sites, yielding diacids Asp¹, Glu¹ and amides Asn², Gln².
6. Source duality is amplified in diacid function (aa source v. N donor), phylogenetic depth, and synthetase class.
7. Pre-LUCA tRNA identities indicate Asp¹ was initially precursor to 15 aa, and Glu¹ to only 3 aa.
8. Asn² and Gln² attachment to tRNA blocked lactam formation by these early, labile aa.
9. Mid-base was assigned in an (A) → C → G → U order to ten increasingly hydrophobic aa of 4, 5, and 7 path-steps.
10. Eight stable code-boxes (4-sets) were assigned (GCN to Orn⁶) during expansion from the N-fixers code.
11. 3'-Base encoded six basic/aromatic aa of 9-14 path-steps, by overprinting six error-prone boxes.
12. tRNA-cofactor exchange led to anomalous assignment of UUR, CUN to Leu s , and AGR, CGN to Arg⁹.

Davis, BK 2007. Making sense of the genetic code with the path-distance model. In: Leading-Edge Messenger RNA
Research Communications, Ed. MH Ostrovskiy. New York: Nova Science, Chp. 1, pp. 1-32.

Davis, BK 2013. Making sense of the genetic code with the path-distance model based on tRNA-dependent pathwa 
https://archive.org/details/MakingSenseOfGeneticCode DOI: 10.13140/RG.2.2.17217.86888






Table S2. LUCA tRNA sequence source: Davis B. K. 2008. Imprint of early tRNA diversification on the 
genetic code: Domains of contiguous codons read by related adaptors for sibling amino acids. In 
Messenger RNA Research Perspectives Ed. T. Takeyama. New York: Nova Biomedical Books Chp. 1, 
pp. 35-79. 